Methods for Implementation of an Active Archive in an Archiving System and Managing the Data in the Active Archive

ABSTRACT

According to the disclosure, a unique and novel archiving system that provides one or more application layer partitions to archive data is disclosed. Embodiments include an active archive including a fixed storage. The active archive can create application layer partitions and associate them with portions of the fixed storage. Each application layer partition, in embodiments, has a separate set of controls that allow for customized storage of different data within a single archiving system. Further, embodiments of methods for ensuring storage capacity in the active archive and the application layer partitions within the active archive are also disclosed.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application Ser. No. 60/977,761, filed Oct. 5, 2007, entitled “METHODS FOR IMPLEMENTATION OF AN ACTIVE ARCHIVE IN AN ARCHIVING SYSTEM AND MANAGING THE DATA IN THE ACTIVE ARCHIVE,” Attorney Docket No. 040252-004200US, which is hereby incorporated herein in its entirety.

BACKGROUND OF THE INVENTION

Embodiments of the disclosure generally relate to storage systems and, more specifically, but not by way of limitation, to archiving storage systems.

An archiving storage system is used by one or more applications or application servers to store data for longer periods of time, for example, one year. Governments and other organizations often require the storage of certain types of data for long periods. For example, the Securities and Exchange Commission (SEC) may require retention of financial records for three or more years. Thus, entities that have to meet these storage requirements employ archiving systems to store the data to media allowing for long-term storage. However, current archiving systems suffer from inadequacies.

Archiving systems in general do not have an easily accessible storage system that can allow a user to quickly retrieve archived data. Further, archiving systems generally allow requirements to be applied only over the entire archive. These requirements or controls ensure the data is stored under the guidelines provided by the outside organization, for example, SEC guidelines. However, some organizations may have data that is covered by more than one outside organization. Thus, some controls for the archive may relate to one outside organization's guidelines, for example, the SEC guidelines, while other controls may relate to a different outside organization, for example, Food and Drug Administration (FDA) guidelines. To compensate for the discrepancy in guidelines, the organization is forced to use the strictest guidelines or buy two archiving systems. The lack of customizability provides a less effective archiving system.

It is in view of these and other considerations not mentioned herein that the embodiments of the present disclosure were envisioned.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments of the present disclosure are described in conjunction with the appended figures:

FIG. 1 is a block diagram of an embodiment of a removable cartridge storage system;

FIG. 2 is a hardware block diagram of an embodiment of an archiving system including one or more removable cartridge storage systems;

FIG. 3 is a functional block diagram of an embodiment of an archiving system;

FIG. 4 is a block diagram of embodiments of an archival management system and an archiving system;

FIGS. 5A-C are block diagrams of embodiments of an archiving system providing multiple, independent file systems;

FIGS. 6A-B are block diagrams of embodiments for data structures representing application layer partitions;

FIGS. 7A-C are bar diagrams of an embodiment of an archiving system showing storage of data in the archiving system;

FIGS. 8A-B are yet other bar diagrams of embodiments of an archiving system showing stubbing of files to reduce storage usage in the archiving system;

FIG. 9 is a block diagram of an embodiment of a database and one or more data structures for stubbing files in an archiving system;

FIG. 10 is a flow diagram of an embodiment of a method for stubbing files in an archiving system;

FIG. 11 is another flow diagram of an embodiment of a method for stubbing files in an archiving system;

FIG. 12 is a flow diagram of an embodiment of a method for determining if stubbing is required in the archiving system;

FIG. 13 is another flow diagram of an embodiment of a method for determining if stubbing is required in the archiving system;

FIG. 14 is yet another flow diagram of an embodiment of a method for determining if stubbing is required in the archiving system;

FIG. 15 is a flow diagram of an embodiment of a method for dynamically altering the amount of storage capacity in two or more application layer partitions of an archiving system.

In the appended figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

The ensuing description provides exemplary embodiment(s) only, and is not intended to limit the scope, applicability or configuration of the possible embodiments. Rather, the ensuing description of the exemplary embodiment(s) will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It is understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the possible embodiments as set forth in the appended claims.

Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits may be shown in block diagrams in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. In some embodiments, a computing system may be used to execute any of the tasks or operations described herein. In embodiments, a computing system includes memory and a processor and is operable to execute computer-executable instructions stored on a computer readable medium that define processes or operations described herein.

Also, it is noted that the embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but could have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.

Moreover, as disclosed herein, the term “storage medium” may represent one or more devices for storing data, including read only memory (ROM), random access memory (RAM), magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other machine-readable mediums for storing information. The term “machine-readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, wireless channels and various other mediums capable of storing, containing or carrying instruction(s) and/or data.

Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine-readable medium such as a storage medium. A processor(s) may perform the necessary tasks. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, an object, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.

Embodiments of the present disclosure provide a unique and novel archiving system. Embodiments include an archiving system having hard disk drives embedded in removable disk cartridges, referred to simply as removable disk drives. The removable disk drives allow for expandability and replacement such that the archiving system need not be duplicated to add new or more storage capacity. Further, the removable disk drives provide advantages in speed and data access because, in embodiments, the data is stored and retrieved by random access rather than sequential access. In embodiments, the removable disk drives are electrically connected to one or more drive ports that are separately addressable. The archiving system can create application layer partitions and associate them with one or more drive ports. Each application layer partition, in embodiments, has a separate set of controls that allow for customized storage of different data within a single archiving system. These and further advantages will be evident to one skilled in the art from a review of the detailed description provided herein.

Further, the present disclosure generally provides an archiving system with an active archive. The active archive provides for short-term storage of archived data in a system where the archived data can be easily retrieved. For files that have been removed from the active archive, the active archive provides information from a set of metadata about the file called a “stub.” In embodiments, the active archive also includes one or more application layer partitions that mirror the application layer partitions created from the one or more removable disk drives. Embodiments of the active archive have limited storage capacity and eliminate data from the active archive on a periodic basis. The present disclosure also generally provides systems and methods for eliminating data in the active archive.

An embodiment of a removable disk system 100 to provide long-term archival data storage is shown in FIG. 1. A removable disk drive 102-1 provides storage capability for the removable disk system 100. In embodiments, the removable disk drive 102-1 includes a data cartridge case 108 and an embedded memory 104, which may be an embedded hard disk drive (HDD), solid state disk (SSD), solid state drive, or flash memory. The HDD or flash memory 104 provides a random access memory for storage of archived data. The embedded memory 104 is in communication with and/or electrically connected to a connector 106. In one embodiment, the connector is a Serial Advanced Technology Attachment (SATA) connector. In other embodiments, the connector is a Universal Serial Bus (USB) connector, parallel connector, Firewire connector, or other connector. Both the embedded memory 104 and connector 106 are, in embodiments, physically attached to the data cartridge case 108, and, in some embodiments, enclosed, protected, connected or integrated by the data cartridge case 108. In other embodiments, the embedded memory 104 and the connector 106 are a physically integrated component, and the connector protrudes from the data cartridge case 108. The data cartridge case 108, in embodiments, provides a solid container for the embedded memory 104 that also functions as an easily swappable or changed case when interchanging removable disk drives 102-1 in the removable disk system 100.

In embodiments, the removable disk system 100 contains a drive port 110-1 that includes one or more data cartridge ports 112, each with a data cartridge connector 114 to receive the removable disk drive 102-1. The data cartridge connector 114 mates with the electrical connector 106 of the removable disk drive 102-1 to provide an electrical connection to the removable disk drive 102-1 and/or to communicate with the embedded memory 104 in the removable disk drive 102-1. As with the electrical connector 106, the data cartridge connector 114 may be a SATA connector or another type of connector. Regardless, the data cartridge connector 114 and the electrical connector 106 can be physically and/or electrically connected. The data cartridge port 112 allows the data cartridge case 108 of the removable disk drive 102-1 to be easily inserted and removed as necessary. In embodiments, the drive port 110-1 includes two or more data cartridge ports 112 to allow for the use, control and communication with two or more removable disk drives 102-1. Each drive port 110-1, in embodiments, is separately addressable to allow for customized control over each removable disk drive 102-1 connected to each data cartridge port 112. Thus, as removable disk drives 102-1 are replaced, the same controls can be applied to the newly inserted removable disk drives 102-1 because the drive port 110-1 is addressed instead of the removable disk drives 102-1.
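
By way of illustration only, the port-addressed control described above may be sketched in Python as follows. The class names, fields, and control keys (Cartridge, DrivePort, "immutable", "encrypted") are hypothetical and are not taken from the disclosure; the sketch only shows that controls attached to a port address survive a cartridge swap.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cartridge:
    """A removable disk drive; the serial field is a hypothetical identifier."""
    serial: str


@dataclass
class DrivePort:
    """A separately addressable drive port. Controls are kept on the port."""
    address: int
    controls: dict = field(default_factory=dict)
    cartridge: Optional[Cartridge] = None

    def insert(self, cartridge: Cartridge) -> None:
        self.cartridge = cartridge

    def eject(self) -> Optional[Cartridge]:
        removed, self.cartridge = self.cartridge, None
        return removed

    def effective_controls(self) -> dict:
        # Whichever cartridge is present is governed by the port's controls.
        return dict(self.controls)


# Swap a cartridge: the controls stay with port address 1.
port = DrivePort(address=1, controls={"immutable": True, "encrypted": True})
port.insert(Cartridge("A-001"))
old = port.eject()
port.insert(Cartridge("B-002"))
```

In this sketch the replacement cartridge is subject to the same controls as the cartridge it replaced, because nothing about the controls is stored on the cartridge object itself.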

The embedded memory 104, in embodiments, includes metadata 118 stored thereon. The metadata 118 can comprise one or more of, but is not limited to, cartridge and/or embedded memory 104 identification, encryption keys or data, other security information, information regarding data stored on the embedded memory 104, information about the data format used for the embedded memory 104, etc. The metadata 118 may be read and used by the firmware 116 of the drive port 110-1. The firmware 116 may be hardware and/or software resident in the drive port 110-1 for controlling the removable disk drive 102-1. In embodiments, the firmware 116 contains the necessary software and/or hardware to power-up the removable disk drive 102-1, spin-up the disk platters in the embedded memory 104, read and write to the embedded memory 104, read, write and process the metadata 118, etc. For example, the firmware 116 could read the metadata 118 to identify the removable disk drive 102-1 and gather information related to its contents.

In embodiments, the removable disk system 100 operates to receive one or more removable disk drives 102-1 in the one or more drive ports 110-1. The electrical connector 106 physically connects or couples with the data cartridge connector 114 to form an electrical connection that allows the drive port 110-1 to communicate with the embedded memory 104. The firmware 116 powers-up the embedded memory 104 and begins any initialization processes (e.g., security processes, identification processes, reading and/or writing to the metadata 118, etc.). The drive port 110-1, which, in embodiments, is in communication with a network, receives archival data from one or more servers, applications, or other devices or systems on the network. The firmware 116 writes the archival data to the embedded memory 104 of the removable disk drive 102-1 to archive the data.

An embodiment of the hardware architecture of an archiving system 200 is shown in FIG. 2. The archiving system 200, in embodiments, comprises a network storage system 202 in communication with one or more systems via a network 204. In embodiments, the systems that communicate with the network storage system 202 comprise applications, application servers, other servers, peripherals, other devices and other systems that archive data on the network storage system 202. For example, application server 1 206 and/or application server 2 208 store archival data on the network storage system 202. An application server 206 or 208 may be an application, peripheral device, system, network component, or other software function or hardware device that may store archived data. Hereinafter, all functions, systems, processes, hardware devices that may store archived data will be referred to as an application or application server. Application server 1 206 and application server 2 208 will hereinafter be used to describe the functions of the archiving system 200 but are not meant to limit the description to the exemplary embodiments set forth herein.

The network storage system 202 comprises one or more components that may be encompassed in a single physical structure or be comprised of discrete components. In embodiments, the network storage system 202 includes an archiving system appliance 210 and one or more removable disk drives 102-2 connected or in communication with a drive port 110-2. In alternative embodiments, a modular drive bay 212 and/or 214 includes two or more drive ports 110-2 that can each connect with a removable disk drive 102-2. Thus, the modular drive bays 212 and 214 provide added storage capacity because more than one removable disk drive 102-2 can be inserted and accessed using the same archiving system appliance 210. Further, each drive port 110-2 in the modular drive bays 212 and 214 is, in embodiments, separately addressable, allowing the archiving system appliance 210 to configure the removable disk drives 102-2 in the modular drive bays 212 and 214 into groups of one or more removable disk drives 102-2. Two or more modular drive bays 212 and 214, in embodiments, are included in the network storage system 202, as evidenced by the ellipses 218. Thus, as more data storage capacity is required, more modular drive bays 212 and 214 may be added to the network storage system 202.

The exemplary hardware architecture in FIG. 2 provides near limitless capacity as more removable disk drives 102-2 can be added to existing modular drive bays 212 or 214 until the modular drive bays 212 and 214 hold all possible removable disk drives 102-2. Then, more modular drive bays 212 and 214 are added to the network storage system 202. Further, removable disk drives 102-2 may be replaced as the removable disk drives 102-2 near their storage capacity. The removed disk drives 102-2, in embodiments, are physically stored if and until the data on the removable disk drives 102-2 needs to be retrieved. If the data on the removable disk drive 102-2 needs to be retrieved, the removable disk drive 102-2 may be inserted into one of the drive ports 110-2 of the modular drive bay 212 or 214, and the information retrieved from the connected removable disk drive 102-2.

The archiving system appliance 210, in embodiments, is a server operating as a file system. The archiving system appliance 210 may be any type of computing system having a processor and memory and operable to complete the functions described herein. An example of a server that may be used in the embodiments described herein is the PowerEdge™ 2950 Server offered by Dell Incorporated of Austin, Tex. The file system executing on the server may be any type of file system, such as the NT File System (NTFS), that can complete the functions described herein.

The archiving system appliance 210, in embodiments, is a closed system that only allows access to the network storage system 202 by applications or other systems and excludes access by users. Thus, the archiving system appliance 210 provides protection to the network storage system 202.

In embodiments, the two or more modular drive bays 212 and/or 214, each having one or more inserted removable disk drives 102-2, form a removable disk array (RDA) 232-1. The archiving system appliance 210 can configure the RDA 232-1 into one or more independent file systems. Each application server 206 or 208 requiring archiving of data may be provided a view of the RDA 232-1 as one or more independent file systems. In embodiments, the archiving system appliance 210 logically partitions the RDA 232-1 and logically associates one or more drive ports 110-2 with each application layer partition. Thus, the one or more removable disk drives 102-2 comprising the application layer partition appear as an independent file system.

In further embodiments, the archiving system appliance 210 provides an interface for application server 1 206 and application server 2 208 that allows the application servers 206 and 208 to communicate archival data to the archiving system appliance 210. The archiving system appliance 210, in embodiments, determines where and how to store the data to one or more removable disk drives 102-2. For example, the application server 1 206 stores archival data in a first application layer drive, such as the first three removable disk drives. The application layer drives are, in embodiments, presented to the application servers 206 and 208 as application layer drives where write and read permissions for any one application layer drive are specific to one of the application servers. As such, the network storage system 202 provides a multiple and independent file system to each application server 206 and 208 using the same hardware architecture.

In alternative embodiments, the network storage system 202 also comprises a fixed storage 216. The fixed storage 216 may be any type of memory or storage media either internal to the archiving system appliance 210 or configured as a discrete system. For example, the fixed storage 216 is a Redundant Array of Independent Disks (RAID), such as the Xtore XJ-SA12-316R-B from AIC of Taiwan. The fixed storage 216 provides an active archive for storing certain data for a short period of time where the data may be more easily accessed. In embodiments, the archiving system appliance 210 copies archival data to both the fixed storage 216 and the removable disk drive 102-2. If the data is needed in the short term, the archiving system appliance 210 retrieves the data from the fixed storage 216.

The archiving system appliance 210 can also configure the active archive in the fixed storage 216 into one or more independent file systems, as with the RDA 232-1. As explained above, each application server may be provided a view of one of two or more independent file systems. Each independent file system may comprise an application layer partition in the RDA 232-1 and a related application layer partition in the fixed storage 216. In embodiments, the archiving system appliance 210 partitions the fixed storage 216 and associates each application layer partition in the fixed storage 216 with an associated application layer partition in the RDA 232-1.
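
As a non-limiting sketch, the pairing of an application layer partition in the RDA with a related partition in the fixed storage may be modeled in Python as follows. The record layout, the partition and server names, and the use of a block range to stand for a portion of the fixed storage are assumptions made for illustration; the disclosure does not prescribe this representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppLayerPartition:
    """One independent file system as seen by a single application server."""
    name: str
    server: str          # application server permitted to read and write
    drive_ports: tuple   # addresses of drive ports in the RDA
    fixed_range: range   # hypothetical block range in the fixed storage


partitions = [
    AppLayerPartition("finance", "app-server-1", (1, 2, 3), range(0, 1000)),
    AppLayerPartition("clinical", "app-server-2", (4, 5), range(1000, 1500)),
]


def view_for(server: str) -> list:
    """Return only the partitions presented to the given application server."""
    return [p for p in partitions if p.server == server]


def overlaps(a: AppLayerPartition, b: AppLayerPartition) -> bool:
    """True if two partitions share a drive port or a fixed-storage block."""
    shared_ports = set(a.drive_ports) & set(b.drive_ports)
    shared_blocks = (a.fixed_range.start < b.fixed_range.stop
                     and b.fixed_range.start < a.fixed_range.stop)
    return bool(shared_ports) or shared_blocks
```

Under these assumptions, view_for("app-server-1") yields only the "finance" partition, and overlaps() can be used to check that no drive port or fixed-storage block is assigned to two partitions.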

As explained above, the archiving system appliance 210, in embodiments, determines where and how to store the data to one or more removable disk drives 102-2. For example, the application server 1 206 stores archival data in a first application layer drive, which may include storing the archival data in the application layer partition in the fixed storage 216 for easier access to the archival data. Again, the application layer drives are, in embodiments, presented to the application servers 206 and 208 where write and read permissions for any one application layer drive are specific to one of the application servers. As such, the network storage system 202 provides a multiple and independent file system to each application server 206 and 208 using the same hardware architecture.

In operation, application server 1 206 stores primary data into a primary storage 228, which may be a local disk drive or other memory. After some predetermined event, the application server 1 206 reads the primary data from the primary storage 228, packages the data in a format for transport over the network 204 and sends the archival data to the network storage system 202 to be archived. The archiving system appliance 210 receives the archival data and determines where the archival data should be stored. The archival data, in embodiments, is then sent to the related application layer partitions in both the fixed storage 216 and the RDA 232-1, which may comprise one or more of the removable disk drives 102-2 in one or more of the drive ports 110-2. The archival data is written to the removable disk drive 102-2 for long-term storage and is written to the fixed storage 216 for short-term, easy-access storage. In further embodiments, application server 2 208 writes primary data to a primary storage 230 and also sends archival data to the network storage system 202. In some embodiments, the archival data from application server 2 208 is stored to a different removable disk drive 102-2 and a different portion of the fixed storage 216 because the archival data from application server 2 208 relates to a different application and, thus, a different application layer partition.
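
The dual write and the short-term read path described above may be sketched, under simplifying assumptions, as follows. The NetworkStorage class and its in-memory dictionaries are hypothetical stand-ins for the fixed storage and the removable disk drives; real embodiments would write to storage devices rather than to Python dictionaries.

```python
class NetworkStorage:
    """Toy model: archival data goes to both stores; reads prefer fixed storage."""

    def __init__(self):
        self.fixed = {}      # active archive: short-term, easy-access copy
        self.removable = {}  # removable disk drives: long-term copy

    def archive(self, partition: str, name: str, data: bytes) -> None:
        key = (partition, name)
        self.removable[key] = data  # long-term storage
        self.fixed[key] = data      # short-term storage

    def retrieve(self, partition: str, name: str):
        """Return (data, source), trying the fixed storage first."""
        key = (partition, name)
        if key in self.fixed:
            return self.fixed[key], "fixed"
        return self.removable[key], "removable"


store = NetworkStorage()
store.archive("finance", "q1.pdf", b"report")
data, source = store.retrieve("finance", "q1.pdf")
```

Keying each entry by (partition, name) keeps data from different application layer partitions separate even when file names collide.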

A block diagram of an archiving system 300 is shown in FIG. 3. The archiving system 300 has one or more functional components that, in embodiments, include a network storage system 302 in communication with a network 304-1. The network 304-1 may be any type of communication infrastructure, for example, one or more of, but not limited to, a wide-area network (WAN), local area network (LAN), wireless LAN, the Internet, etc. The network storage system 302 may communicate with one or more other systems coupled to, connected to or in communication with the network 304-1. For example, the network storage system 302 communicates with an application server 306. Communications between systems on the network 304-1 may occur by any protocol or format, for example, Transmission Control Protocol/Internet Protocol (TCP/IP), Hypertext Transfer Protocol (HTTP), etc.

The network storage system 302, in embodiments, comprises one or more functional components embodied in hardware and/or software. In one embodiment, the network storage system 302 comprises an archiving system 312-1 in communication with one or more drive ports 110-3 that are in communication with one or more removable disk drives 102-3. The drive ports 110-3 and removable disk drives 102-3 are similar in function to those described in conjunction with FIG. 1. The archiving system 312-1 controls the function of the one or more drive ports 110-3 and writes the archived data to one or more predetermined removable disk drives 102-3 in the one or more drive ports 110-3.

In further embodiments, the network storage system 302 comprises an archival management system 310-1. The archival management system 310-1 receives data for archiving from one or more systems on the network 304-1. Further, the archival management system 310-1 determines to which system or removable disk drive 102-3 the data should be archived, in which format the data should be saved, and how to provide security for the network storage system 302. In embodiments, the archival management system 310-1 provides a partitioned archive such that the network storage system 302 appears to be an independent file system to each separate application server 306, yet maintains the archive for multiple application servers 306. Thus, the archival management system 310-1 manages the network storage system 302 as multiple, independent file systems for one or more application servers 306. In embodiments, the archival management system 310-1 and the archiving system 312-1 are functional components of the archiving system appliance 210 (FIG. 2).

In embodiments, the archival management system 310-1 saves archival data to both the archiving system 312-1 and an active archive 314-1. The active archive 314-1, in embodiments, controls, reads from and writes to one or more fixed storage devices 316 that allow easier access to archived data. In embodiments, fixed storage 316 is similar in function to fixed storage 216 (FIG. 2). The active archive 314-1 performs similar functions to the archiving system 312-1 but for the fixed storage devices 316. In embodiments, the active archive 314-1 and the fixed storage devices 316 are components of the hardware fixed storage system 216 (FIG. 2). In alternative embodiments, the active archive 314-1 partitions the fixed storage 316 to mirror the associated application layer partitions in the RDA 232-2. The application layer partition(s) in the active archive 314-1 may have boundaries associated with memory addresses in the fixed storage 316.

The archival management system 310-1 may also provide an intelligent storage capability. Each type of data sent to the network storage system 302 may have different requirements and controls. For example, certain organizations, such as the SEC, Food and Drug Administration (FDA), European Union, etc., have different requirements for how certain data is archived. The SEC may require financial information to be kept for seven (7) years while the FDA may require clinical trial data to be kept for thirty (30) years. Data storage requirements may include immutability (the requirement that data not be overwritten), encryption, a predetermined data format, retention period (how long the data will remain archived), etc. The archival management system 310-1 can apply controls to different portions of the RDA 232-2 and the active archive 314-1 according to user-established data storage requirements. In one embodiment, the archival management system 310-1 creates application layer partitions in the archive that span one or more removable disk drives 102-3 and one or more portions of the fixed storage 316. All data to be stored in any one application layer partition can have the same requirements and controls. Thus, requirements for data storage are applied to different drive ports 110-2 (FIG. 2) in the modular drive bays 212 and 214 (FIG. 2) and to the removable disk drives 102-2 (FIG. 2) stored in those drive ports 110-2 (FIG. 2). Further, the requirements are likewise applied to different portions of the fixed storage 316 in the active archive 314-1. If a removable disk drive is replaced, the same storage requirements, in embodiments, are applied to the replacement removable disk drive 102-3 because of its location in the controlled drive port. As such, the archival management system 310-1 can individually maintain separate sets of data using different controls, even in different removable disk drives.
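
For purposes of illustration, per-partition controls of the kind described above may be sketched in Python as follows. The Controls record, the partition names, and the helper functions are hypothetical; the seven-year and thirty-year figures are the examples given above, and the handling of a February 29 storage date is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Controls:
    """Storage requirements shared by all data in one application layer partition."""
    retention_years: int
    immutable: bool = True
    encrypted: bool = False


controls_by_partition = {
    "finance": Controls(retention_years=7),                    # e.g., SEC example
    "clinical": Controls(retention_years=30, encrypted=True),  # e.g., FDA example
}


def earliest_deletion(partition: str, stored_on: date) -> date:
    """Date on which the retention period for the partition has elapsed."""
    years = controls_by_partition[partition].retention_years
    try:
        return stored_on.replace(year=stored_on.year + years)
    except ValueError:
        # Assumption: a Feb 29 storage date maps to Feb 28 in a non-leap year.
        return stored_on.replace(year=stored_on.year + years, day=28)


def may_overwrite(partition: str) -> bool:
    """Immutable partitions never permit overwriting archived data."""
    return not controls_by_partition[partition].immutable
```

Because the controls are looked up by partition rather than by file or by cartridge, two data sets with different retention rules can coexist in one archiving system.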

The network storage system 302 may also comprise a database 318-1 in communication with the archival management system 310-1. The database 318-1 is, in embodiments, a memory for storing information related to the data being archived. The database 318-1 may include HDDs, ROM, RAM or other memory either internal to the network storage system 302 and/or the archival management system 310-1 or separate as a discrete component addressable by the archival management system 310-1. The information stored in the database 318-1, in embodiments, includes one or more of, but is not limited to, data identification, application server identification, time of storage, removable disk drive identification, data format, encryption keys, application layer partition organization, etc.

The network 304-1, in embodiments, connects, couples, or otherwise allows communications between one or more other systems and the network storage system 302. For example, the application server 306 is connected to the network storage system 302 via the network 304-1. The application server 306 may be a software application, for example, an email software program, a hardware device, or other network component or system. The application server 306, in embodiments, communicates with a memory that functions as the application server's primary storage 308. The primary storage 308 is, in embodiments, a HDD, RAM, ROM, or other memory either local to the application server 306 or in a separate location that is addressable.

In embodiments, the application server 306 stores information to the primary storage 308. After some predetermined event, such as the expiration of some period of time, the application server 306 sends data to the network storage system 302 to archive the data. The application server 306 may send the data by any network protocol, such as TCP/IP, HTTP, etc., over the network 304-1 to the network storage system 302. The data is received at the archival management system 310-1. The archival management system 310-1, in embodiments, sends the data to one or both of the active archive 314-1 and the archiving system 312-1 to be archived.

Embodiments of an archival management system 310-2 and an archiving system 312-2, including one or more components or modules, are shown in FIG. 4. In embodiments, the archival management system 310-2 comprises one or more of a protection module 402, an active archive management module 404, and an audit module 405. In embodiments, the protection module 402 protects access to the archiving system 302 (FIG. 3) by applications, application servers, or other components on the network. For example, the protection module 402 prohibits a user from accessing the archiving system 312-2 if the archiving system 312-2 is a closed system. Thus, the protection module 402 may authenticate a system, determine access rights of the system, perform decryption of data, and other processes.

The active archive management module 404, in embodiments, manages data written to and read from the active archive 314-2. In embodiments, the active archive management module 404 determines if archival data should be written to the active archive 314-2 based on information provided by the application server or on information stored in the database 318-2. In further embodiments, the active archive management module 404 determines when data in the active archive 314-2 is removed from the active archive 314-2, as explained in conjunction with FIGS. 6A-13. According to information in the database 318-2, one or more items of data may only reside in the active archive 314-2 for a predetermined period of time, for example, three months. After the expiration of the predetermined period of time, the data is removed from the active archive 314-2 and replaced with a “stub” containing metadata about the data, leaving only the copy stored in the removable disk drive(s) for retrieval. The active archive management module 404 may also partition the active archive 314-2.

The audit module 405, in embodiments, stores data about the archival data stored in the archiving system 312-2 and active archive 314-2. In embodiments, the audit module 405 records information, for example, the application server that sent the data, when the data was received, the type of data, where in the archiving system 312-2 the data is stored, where in the active archive 314-2 the data is stored, the period of time the data will be stored in the active archive 314-2, etc. The audit module 405 can provide a “chain of custody” for the archived data by storing the information in the database 318-2.

The archiving system 312-2, in embodiments, includes one or more of an authenticity module 406, an indexing module 408 and/or a placement/media management module 410. In embodiments, the authenticity module 406 determines if a removable disk drive is safe to connect with the archiving system 312-2. For example, the authenticity module 406 may complete an authentication process, such as, AES 256, a public-key encryption process, or other authentication process, using one or more keys to verify that the inserted removable disk drive has access to the archiving system 312-2.

The indexing module 408, in embodiments, creates application layer partitions in the RDA 232-1 (FIG. 2) to provide storage areas for different data. For example, the indexing module 408 selects one or more removable disk drives to form one or more “drives”. “Drive A:\” 412 may comprise one or more removable disk drives, while “Drive B:\” 414 and “Drive C:\” 416 may also include one or more removable disk drives. In embodiments, each drive is associated with an application layer partition of the RDA 232-1 (FIG. 2). There may be fewer than three application layer partitions of the RDA 232-1 (FIG. 2) or more than three application layer partitions of the RDA 232-1 (FIG. 2).

In embodiments, each drive or application layer partition stores only a predetermined type of data that relates to one or more application servers. For example, Drive A:\ 412 stores email data, while Drive B:\ 414 stores Health Insurance Portability and Accountability Act (HIPAA) data.

In further embodiments, the active archive management module 404 creates application layer partitions in the active archive 314-2 that are associated with the application layer partitions in the RDA 232-1 (FIG. 2). For example, the active archive management module 404 selects portions of the active archive 314-2 to form one or more “drives” that are associated with the drive(s) in the RDA 232-1 (FIG. 2). In embodiments, the active archive's “Drive A:\” 412 is associated with Drive A:\ in the RDA 232-1 (FIG. 2), while “Drive B:\” 414 and “Drive C:\” 416 also are associated with Drive B:\ and Drive C:\, respectively, in the RDA 232-1 (FIG. 2). In embodiments, each active archive drive 412, 414 and 416 is associated with an application layer partition of the active archive 314-2. There may be fewer than three application layer partitions of the active archive 314-2 or more than three application layer partitions of the active archive 314-2, as represented by the ellipses 418. In embodiments, each drive or application layer partition stores the same type of data as the application layer partitions in the RDA 232-1 (FIG. 2). Continuing the example above, Drive A:\ 412 stores email data, while Drive B:\ 414 stores HIPAA data, which is the same as the application layer partitions in the RDA 232-1 (FIG. 2).

The application server(s) can view the application layer partitions in both the active archive 314-2 and the RDA 232-1 (FIG. 2) and, as such, view the active archive 314-2 and the RDA 232-1 (FIG. 2) as a virtual archiving system with a separate, independent drive inside the active archive 314-2 and the RDA 232-1 (FIG. 2) for the application server. One application server may only access the one or more drives related to the data the application server archives and may not access other drives not associated with the data the application server archives.

In further embodiments, the active archive management module 404 provides controls for each drive in the active archive 314-2. How data is archived for one type of data may be different from how a second type of data is archived. For example, an organization (e.g., the SEC) may require email to be stored for seven years while the Department of Health and Human Services (HHS) may require HIPAA data to be stored for six (6) months. The active archive management module 404 can manage each drive differently to meet the requirements for the data. For example, the active archive management module 404 may store email on drive A:\ 412 for seven years and store HIPAA data on drive B:\ 414 for six months. The active archive management module 404, in embodiments, stores information about which portions of the active archive 314-2 comprise the separate application layer partitions and enforces the controls on those portions of the active archive 314-2. Other controls enforced by the active archive management module 404 may include the format of data stored on a drive, whether data is encrypted in the active archive 314-2, when and how data is erased from the active archive 314-2, etc. In a further embodiment, the indexing module 408 performs the same or similar functions for the RDA 232-1 (FIG. 2).
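
As a minimal sketch of per-drive controls, the following Python fragment keeps a separate retention rule for each drive label and checks whether an item has satisfied its retention period. The names DriveControls, CONTROLS and may_erase are hypothetical and are not part of the disclosure; the retention values mirror the example above, with six months approximated as 183 days and a year as 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DriveControls:
    data_type: str        # kind of data the drive holds
    retention_days: int   # how long the data must be kept
    encrypted: bool = False


# Hypothetical control table keyed by drive label (assumed values).
CONTROLS = {
    "A:\\": DriveControls("email", 7 * 365),
    "B:\\": DriveControls("HIPAA", 183),
}


def may_erase(drive: str, stored_on: date, today: date) -> bool:
    """Return True once the drive's retention period has elapsed."""
    controls = CONTROLS[drive]
    return today >= stored_on + timedelta(days=controls.retention_days)
```

Because each drive carries its own rule, the same stored-on date can yield a different answer depending on which drive the data sits on.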

In embodiments, the placement/media management module 410 manages the removable disk drives in the RDA 232-1 (FIG. 2). For example, the placement/media management module 410 determines when cartridges need replacing because the removable disk drive is at or near capacity. In embodiments, the placement/media management module 410 also separately addresses the removable disk drives and provides the addressing information to the indexing module 408 for storing data in the correct application layer partition.

Some organizations require that archived data be immutable, that is, the data cannot be overwritten or deleted for a period of time. To ensure data stored in the RDA 232-1 (FIG. 2) is immutable, the placement/media management module 410, in embodiments, enforces a Write Once Read Many (WORM) process on the removable disk drives storing immutable data. The WORM process may comprise one or more functions that write data to the removable disk drive in a manner that prevents it from being overwritten, e.g., write protection, sequential writing to disk, etc. Data for an application layer partition may require WORM enforcement according to the indexing module 408. The placement/media management module 410 can determine what removable disk drives are associated with the application layer partition needing WORM enforcement and enforce the WORM process on the removable disk drives associated with the application layer partition.
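
The write-once behavior can be illustrated with a toy in-memory store. This is only a sketch under the assumption that each record has a unique key; the class WormDrive and its methods are hypothetical and do not represent how a removable disk drive enforces write protection.

```python
class WormDrive:
    """Toy write-once store: each key may be written a single time."""

    def __init__(self):
        self._records = {}

    def write(self, key, data):
        # Refuse any attempt to overwrite an existing record.
        if key in self._records:
            raise PermissionError(f"{key!r} is already written and immutable")
        self._records[key] = bytes(data)

    def read(self, key):
        # Reads are unrestricted ("read many").
        return self._records[key]

    def delete(self, key):
        # Deletion is always refused in this sketch.
        raise PermissionError("WORM data cannot be deleted")
```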

As explained in conjunction with FIG. 4, the network storage system 302 (FIG. 3) can present the archive as multiple, independent file systems. Block diagrams showing embodiments of the multiple, independent file systems are shown in FIGS. 5A-5C. The network storage system 302-2 is in communication with the network 304-2. In embodiments, one or more application servers, such as application server 1 502-1, application server 2 504-1 and application server 3 506, are also in communication with the network 304-2 and can send archival data to the network storage system 302-2 via the network 304-2. The network storage system 302-2 may include an RDA with one or more removable disk drives and, in some embodiments, an active archive 514-1. The active archive 514-1, in embodiments, is partitioned to create one or more application layer partitions, such as application layer partitions 508-1, 510-1 and 512. In embodiments, the application layer partitions are viewed as storage drives that an application server can request to have mounted to archive data. For example, application layer partition 1 508-1 is labeled drive “A:\”, while application layer partition 2 510-1 is labeled drive “B:\” and application layer partition 3 512 is labeled drive “C:\”. Other labels may be used for the application layer partitions, for example, a globally unique identifier (GUID) or other identifier may be used to identify the application layer partitions. In the examples shown in FIGS. 5A-5C, the drive labels will be used but these examples are not meant to limit the embodiments to that type of label or identifier for the application layer partitions.

In embodiments, each application server 502-1, 504-1 and 506 has access to and “sees” only the application layer partition into which that application server 502-1, 504-1 or 506 archives data. For example, with regard to FIG. 5B, application server 1 502-2 only accesses application layer partition 508-2 in the active archive 514-2. As such, to application server 1 502-2, the active archive 514-2 may only consist of a single file system. The application server 1 502-2, in embodiments, asks for drive A:\ to be mounted and sends archival data over the network 304-3 to only drive A:\. The application server 1 502-2 cannot send data to other drives.

Likewise, application server 2 504-2, in embodiments, only accesses application layer partition 510-2, as shown in FIG. 5C. Thus, the network storage system 302-4 also appears as only one file system to application server 2 504-2. Application server 2 504-2 may only send data to drive B:\ 510-2 over the network 304-4. Application server 2 504-2, in embodiments, does not recognize that other file systems are included in the active archive 514-3. By partitioning the active archive 514-1, the active archive 514-3 can include multiple file systems and can operate as independent file systems for each application server storing archival data.
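
The single-file-system view described for FIGS. 5B and 5C can be sketched as a lookup that maps each application server to the one drive it may mount. The class PartitionedArchive, the server names and the one-drive-per-server mapping are illustrative assumptions, not elements of the disclosure.

```python
class PartitionedArchive:
    """Toy archive in which each server sees only its own drive."""

    def __init__(self, grants):
        # grants maps an application server name to its drive label.
        self._grants = dict(grants)
        self._drives = {label: [] for label in self._grants.values()}

    def visible_drives(self, server):
        # A server "sees" only the drive it archives to.
        label = self._grants.get(server)
        return [label] if label else []

    def mount(self, server, label):
        if self._grants.get(server) != label:
            raise PermissionError(f"{server} cannot mount {label}")
        return label

    def archive(self, server, label, data):
        # Data is accepted only for the server's own drive.
        self.mount(server, label)
        self._drives[label].append(data)
```

In this sketch a request from one server to mount another server's drive fails, so the archive behaves as an independent file system for each server.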

Embodiments of a database 600 comprising one or more data structures for organizing the network storage system into application layer partitions are shown in FIGS. 6A-6B. In embodiments, the database 600 is similar to or the same as database 318-1 (FIG. 3). The database 600 can be a partition table or other data structure for storing the information described herein. In an embodiment, the database 600 includes one or more application layer partition fields 602 and 604 that represent the application layer partitions in the RDA and the active archive. There may be fewer or more than two application layer partition fields as represented by the ellipses 614. Each application layer partition field 602 or 604 may have one or more fields representing data about the application layer partition represented by the application layer partition fields 602 or 604.

In embodiments, an application layer partition field 602 may comprise one or more of, but is not limited to, an application layer partition identification field 606, one or more control fields 608-1, one or more drive port fields 612, and/or one or more active archive portions fields 616. In alternative embodiments, the application layer partition field 602 also includes one or more folder fields 610-1. The application layer partition identification field 606, in embodiments, includes an identification that can be used by an application server 502-1 (FIG. 5A) to send data to the application layer partition represented by the application layer partition field 602. In one embodiment, the identification is a GUID for the application layer partition. In another embodiment, the identification is the drive letter assigned to the application layer partition. For example, application layer partition field 602 represents application layer partition 1 508-1 (FIG. 5A), and the application layer partition identification field 606 would be drive letter “A:\”.

Further embodiments of the application layer partition field 602 include one or more drive port fields 612. In embodiments, the one or more drive port fields 612 associate one or more drive ports 602 (FIG. 6) with the application layer partition 508-1 (FIG. 5A). The association may include listing the one or more interface addresses for the one or more drive ports in the one or more drive port fields 612. In other embodiments, a drive port is assigned a slot number or identification. The slot number may then be stored in the drive port field 612. The drive port fields 612 can be used by the network storage system to address archival data to one or more removable disk drives electrically connected to the one or more drive ports.

Embodiments of the application layer partition field 602 may also include one or more active archive portion fields 616. In embodiments, the one or more active archive portion fields 616 associate one or more portions of the active archive 314-2 (FIG. 4) with the application layer partition 508-2 (FIG. 5B). The association may include listing the one or more memory addresses in the fixed storage 216 (FIG. 2). The memory addresses may comprise one memory address and one or more offsets. The memory bounded by and/or including the memory addresses represents the allowed memory space for the application layer partition. The active archive portion fields 616 can be used by the network storage system to address archival data to one or more portions of the active archive 314-2 (FIG. 4).

One or more control fields 608-1 and one or more folder fields 610-1, in embodiments, are also included in the application layer partition field 602. The control fields 608-1 provide one or more controls for the application layer partition represented by the application layer partition field 602. Likewise, the folder fields 610-1 provide a designation of one or more folders that can be used for storing data in the application layer partition represented by the application layer partition field 602. Embodiments of the control fields 608-1 are further described in conjunction with FIG. 6B.
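
One possible in-memory shape for an application layer partition field is sketched below. The class and attribute names are hypothetical; the comments map each attribute to the reference numeral it is meant to illustrate. Treating a portion as a half-open range starting at a base address is an assumption, since the description only says the allowed space is bounded by and/or includes the listed addresses.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArchivePortion:
    base: int    # first memory address of the portion
    offset: int  # size of the portion, measured from base

    def contains(self, address: int) -> bool:
        # Assumed half-open range [base, base + offset).
        return self.base <= address < self.base + self.offset


@dataclass
class PartitionField:
    partition_id: str                                             # 606: drive letter or GUID
    controls: Dict[str, object] = field(default_factory=dict)     # 608
    folders: List[str] = field(default_factory=list)              # 610
    drive_ports: List[int] = field(default_factory=list)          # 612: slot numbers
    portions: List[ArchivePortion] = field(default_factory=list)  # 616

    def allows(self, address: int) -> bool:
        """True if the address lies in this partition's allowed space."""
        return any(p.contains(address) for p in self.portions)
```

A partition table is then simply a collection of such records, one per application layer partition.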

An embodiment of one or more control fields 608-2 is shown in FIG. 6B. The control fields 608-2 may include one or more of, but are not limited to, a protection copies field 616, a data type field 618, a residency field 620, a default duration field 622, an audit trail field 624, an encryption field 626, and an inherit field 628. The protection copies field 616, in embodiments, includes a number of copies that need to be kept of the data. For example, if there is a two (2) in the protection copies field 616, two copies of the application layer partition or of the data within the application layer partition are maintained in the active archive and the RDA.

The data type field 618, in embodiments, represents how the data is maintained. For example, the data type field 618 includes a designation that the data in the application layer partition is WORM data. As such, all data in the application layer partition is provided WORM protection. In alternative embodiments, the data type field 618 may also describe the type of data stored, such as, email data, HIPAA data, etc.

In embodiments, the residency field 620 specifies the storage requirements for the data in the active archive. The data in the active archive can have a residency time, a duration the data is kept in the active archive, that is different from the time the data is kept in the RDA. For example, the data in the active archive may be kept for three (3) months, while the same data stored in the RDA stays in the RDA for two (2) years. Further, some data in the active archive can be permanent residency data, data that is never deleted from the active archive. A flag, in embodiments, is set in the residency field 620 to represent the data as permanent residency data.

The default duration field 622, in embodiments, sets a duration for maintaining the data in the RDA. For example, an outside organization may require the data in the application layer partition to be maintained for six (6) months. The default duration field 622 is set to six months to recognize this limitation.

The audit trail field 624, in embodiments, is a flag that, if set, requires an audit trail to be recorded for the data. In embodiments, the audit trail includes a log or record of every action performed in the RDA or active archive that is associated with the data. For example, the time the data was stored, any access of the data, any revision to the data, or the time the data was removed would be recorded in the audit trail. In other embodiments, the audit trail field 624 comprises the record or log of the audit trail.

In embodiments, the encryption field 626 comprises a flag of whether the data in the application layer partition is encrypted. If the flag is set, the data is encrypted before storing the data into the RDA or the active archive. In alternative embodiments, the encryption field 626 also includes the type of encryption, for example, AES 256, the public key used in the encryption, etc., and/or the keys for encryption.

An inherit field 628, in embodiments, comprises a flag that, if set, requires that all folders in the application layer partition use the controls set in the application layer partition field 608-2. In embodiments, the inheritance flag 628 represents that only those controls that are set are inherited by a folder in the application layer partition. In other embodiments, if the flag is set, the folders use the controls in the folder fields 610 instead of the controls in the application layer partition field 608-2. The ellipses 644 represent that other controls may exist.
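
The embodiment in which only the controls that are set are inherited can be sketched as a dictionary merge. The function name, the control keys, and the use of None to mean "not set" are assumptions made for illustration; the sketch covers only that one embodiment.

```python
def effective_controls(partition_controls, folder_controls, inherit):
    """Resolve the controls that apply to a folder.

    A partition-level value of None is treated as "not set" (assumption).
    """
    if not inherit:
        # Without inheritance the folder keeps its own controls.
        return dict(folder_controls)
    merged = dict(folder_controls)
    # Only partition controls that are actually set override the folder.
    merged.update({k: v for k, v in partition_controls.items() if v is not None})
    return merged
```

For example, a partition that sets encryption but leaves residency unset would force encryption on every folder while letting each folder keep its own residency period.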

Bar diagrams representing embodiments of the memory in an active archive 700 are shown in FIGS. 7A-C. The bar diagrams represent the amount of total storage capacity in the active archive 700 where the bottom of the diagram represents the first memory address and the top of the bar diagram represents the last memory address and the total capacity 702 for the active archive. The total storage capacity may consist of the storage capacity of one or more memories, for example, four HDDs.

The amount of archived data currently stored in the active archive 700-1 is represented by the currently stored data bar 704. The currently stored data bar 704 represents how much of the total capacity 702 is currently being used by archived data. A portion of the currently stored data 704 may be permanent residency data represented by the permanent residency bar 706. Permanent residency data may be any data that should not be removed from the active archive 700 and that should be available for access permanently.

In embodiments, one or more limits or marks are created to determine when data in the active archive 700 should be removed and replaced with a stub or eliminated. For example, an active archive high occupancy mark (HOM) 708 is a percentage of the total capacity 702 of the active archive 700-1. If the amount of the currently stored data 704 is a greater percentage of the total capacity 702 of the active archive 700-1 than the active archive HOM 708, then some archived data may need to be eliminated. For example, if the currently stored data 704 is 90% of the total capacity 702 of the active archive 700-1 and the active archive HOM 708 is set at 85%, then archived data needs to be eliminated. Hereinafter, the currently stored data 704 will be said to cross over the active archive HOM 708 when the percentage of storage used by the currently stored data 704 is more than the percentage set for the active archive HOM 708.

An active archive low occupancy mark (LOM) 710, in embodiments, represents another set percentage of the total capacity 702 of the active archive 700-1. Unlike the active archive HOM 708, the active archive LOM 710 represents a threshold for when removal of archived data should stop. In other words, if the active archive HOM 708 is crossed over, archived data in the active archive 700-1 is removed. The process for removing the archived data should continue until the currently stored data 704 is less than the active archive LOM 710. Once the active archive LOM 710 is, in embodiments, crossed over, the removal of archived data stops.
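
The interplay of the two marks can be sketched as a small function: nothing is removed unless usage is above the HOM, and once removal starts it continues until usage is below the LOM. The function name and the 70% LOM default are assumptions; the 85% HOM default follows the example above. Integer percentages are used to avoid floating-point comparisons.

```python
def items_to_remove(sizes, capacity, hom_pct=85, lom_pct=70):
    """Count how many items (taken in list order) must be removed.

    sizes: storage used by each stored item, in the order in which
    items would be removed. Returns 0 if the HOM has not been crossed.
    """
    used = sum(sizes)
    if used * 100 <= hom_pct * capacity:
        return 0  # HOM not crossed over, so no removal is triggered
    removed = 0
    for size in sizes:
        if used * 100 < lom_pct * capacity:
            break  # usage is now below the LOM, so removal stops
        used -= size
        removed += 1
    return removed
```

With nine items of 10 units each in a 100-unit archive, usage is 90% and three items are removed to bring usage to 60%, below the assumed 70% LOM. With eight such items usage is 80%, the HOM is not crossed, and nothing is removed.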

Another embodiment of the active archive 700-2 is shown in FIG. 7B. In this example, the active archive 700-2 is partitioned into four (4) application layer partitions: application layer partition A:\ 712-1, application layer partition B:\ 714-1, application layer partition C:\ 716-1, and application layer partition D:\ 718-1. There may be more or fewer application layer partitions in the active archive 700. In embodiments, each application layer partition 712-1, 714-1, 716-1, or 718-1 has an amount of currently stored data 720-1, 720-2, 720-3, 720-4. In one or more application layer partitions, part of the currently stored data 720 may be permanent residency data 722-1, 722-2, 722-3. In one embodiment, to determine if the currently stored data 720 is greater than the active archive HOM 708, the percentage of each of the application layer partitions' currently stored data 720 is added and compared to the active archive HOM 708. A similar process may be used to determine if the currently stored data 720 is lower than the active archive LOM 710.

Still another embodiment of the active archive 700-3 is shown in FIG. 7C. In the exemplary embodiment, the active archive 700-3 is again partitioned into four (4) different application layer partitions: application layer partition A:\ 712-2, application layer partition B:\ 714-2, application layer partition C:\ 716-2, and application layer partition D:\ 718-2. The active archive 700-3 may have an active archive HOM 724 and an active archive LOM 726. In alternative embodiments, each application layer partition 712-2, 714-2, 716-2, or 718-2 also contains an application layer partition HOM 728-1, 728-2, 728-3, 728-4 and an application layer partition LOM 730-1, 730-2, 730-3, 730-4. The application layer partition HOM 728 and the application layer partition LOM 730 can perform the same function as the active archive HOM 724 and the active archive LOM 726 but on an application layer partition-by-application layer partition basis. In embodiments, the one or more application layer partitions 712-2, 714-2, 716-2, or 718-2 do not include all the storage available in the active archive 700-3 and there remains unused “free space” 732.
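
A minimal sketch of checking both levels of marks follows: the usage of all partitions is summed and compared with the archive-wide HOM, and each partition is also compared with its own HOM. The function names and the tuple layout are hypothetical, and the capacities in the usage note are invented for illustration.

```python
def over_mark(used, capacity, mark_pct):
    """True if used storage exceeds mark_pct percent of capacity."""
    return used * 100 > mark_pct * capacity


def check_marks(partitions, archive_capacity, archive_hom_pct):
    """partitions maps a label to (used, capacity, partition_hom_pct).

    Returns (archive_over_hom, labels_of_partitions_over_their_hom).
    """
    total_used = sum(used for used, _, _ in partitions.values())
    archive_over = over_mark(total_used, archive_capacity, archive_hom_pct)
    partitions_over = sorted(
        label
        for label, (used, capacity, hom_pct) in partitions.items()
        if over_mark(used, capacity, hom_pct)
    )
    return archive_over, partitions_over
```

For instance, with four 100-unit partitions in a 500-unit archive (leaving 100 units of free space), a partition holding 95 units can cross its own 90% mark while the archive as a whole stays well under an 85% mark.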

Further embodiments of the active archive 800 are shown in FIGS. 8A-B. The bar diagrams in FIGS. 8A and 8B may represent the entire active archive 800 or an application layer partition within the active archive 800. The bar diagrams also show the active archive 800-1 before data is eliminated from the active archive 800-1 and after data has been eliminated from the active archive 800-2. In embodiments, the archive data is eliminated by replacing data with a stub file.

The total amount of data stored in the active archive or application layer partition 800-1 is represented by line 802. The total amount of storage may be a sequential set of data or may be, as shown in the example in FIGS. 8A and 8B, a set of files stored in the active archive or application layer partition 800. In the exemplary embodiment, the active archive 800-1 stores five (5) files: file A 804, file B 806, file C 808, file D 810, and file E 812. The files, in embodiments, each include a file identifier, file attributes, and file data.

In embodiments, the active archive or application layer partition 800 includes a HOM 814 and a LOM 816. To reduce the amount of storage being used in the active archive or application layer partition 800-1, one or more of the files 804, 806, 808, 810, or 812 has data eliminated such that the total storage 802 is below the LOM 816. As such, each file is analyzed to determine which file may have data eliminated and replaced with a stub file. In embodiments, a stub file provides the file identifier and, in embodiments, one or more file attributes but does not include the archived data. Embodiments of methods for determining which files to stub are described in conjunction with FIGS. 10-11.

One or more files, in embodiments, are determined to have archived data eliminated. In the exemplary embodiment, file A 804-2 is determined to be permanent residency data and is not altered. File C 808-2 and file D 810-2, while not permanent residency data, are determined to not be altered. In contrast, file B 806-2 and file E 812-2 have a file stub replace the preexisting file 806-1 and 812-1. As can be seen in the example, the amount of storage used by the file B 806-2 and file E 812-2 is greatly reduced. The total storage used 802-2 is below the LOM 816-2 by stubbing just the two files, file B 806-2 and file E 812-2. Again, it should be noted that the above process of reducing data storage can be used in either the active archive or in an application layer partition of the active archive.

An embodiment of an active archive 900 having one or more data structures for one or more stubbed files is shown in FIG. 9. In the example presented in FIG. 9, there are five data structures 902, 904, 906, 908, and 910 that represent files stored in an active archive 900 that are associated with the files 804-2, 806-2, 808-2, 810-2, and 812-2 in FIG. 8B. There may be fewer files than those shown in the exemplary active archive 900 or one or more other files may exist in the active archive 900 as represented by the ellipses 912.

In embodiments, a file data structure 902 may comprise a file identifier 914, file metadata 916, and file data 918. A file identifier 914 may be any identifier of the file, for example a file GUID. The file metadata 916, in embodiments, includes the information or attributes about the file, for example, the file size, file location, file save date and time, file creation date and time, file creator, etc. File data 918 can include the archived data sent from the application server. In embodiments, file A 902, file C 906 and file D 908 include file data.

File B 904 and file E 910, in embodiments, have been converted into stub files. In embodiments, a stub file has at least a portion of the file data eliminated. The archival management system 310-1 (FIG. 3) or the active archive system 314-1 (FIG. 3) may create the stub files. In one embodiment, the file data is replaced with a pointer 920, which provides a link or other information to retrieve the file from another location, for example, the RDA 232-1 (FIG. 2). In other embodiments, the file data is eliminated without replacing the file data with a pointer 920. Other parts of the file may also be eliminated, for example, the file metadata and/or the file identifier. If the file identifier or file metadata is eliminated, a record of the file, in embodiments, is recorded in the database 318-1 (FIG. 3) to ensure that the archival management system 310-1 (FIG. 3) does not search for the file in the active archive 900.
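
The file data structure and its conversion into a stub can be sketched as follows. The class ArchivedFile, the function make_stub and the "rda://" location string in the usage note are hypothetical; the sketch covers the two embodiments in which the file data is either replaced with a pointer or eliminated without one, and keeps the identifier and metadata intact.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArchivedFile:
    file_id: str                                   # file identifier 914
    metadata: dict = field(default_factory=dict)   # file metadata 916
    data: Optional[bytes] = None                   # file data 918
    pointer: Optional[str] = None                  # pointer 920, if any

    @property
    def is_stub(self) -> bool:
        # A file with no data left is treated as a stub in this sketch.
        return self.data is None


def make_stub(f: ArchivedFile, location: Optional[str] = None) -> ArchivedFile:
    """Eliminate the file data, optionally leaving a pointer behind."""
    f.data = None
    f.pointer = location
    return f
```

Calling make_stub(f, "rda://drive-B/file-B") would leave a stub whose pointer names where the full copy can be retrieved, while make_stub(f) eliminates the data with no pointer.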

An embodiment of a method 1000 for creating stub files is shown in FIG. 10. In embodiments, the method generally begins with a START operation 1002 and terminates with an END operation 1014. The steps shown in the method 1000 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 10, the steps shown or described can, in some circumstances, be executed in a different order than presented herein.

Read operation 1004 reads the metadata for one or more files. In embodiments, the archival management system 310-1 (FIG. 3) determines that files need to be stubbed. The archival management system 310-1 (FIG. 3) can read the metadata 916 (FIG. 9) for one or more files 902 (FIG. 9), 904 (FIG. 9), 906 (FIG. 9), 908 (FIG. 9), and 910 (FIG. 9). In embodiments, the archival management system 310-1 (FIG. 3) stores the metadata 916 (FIG. 9) or portions of the metadata 916 (FIG. 9) into a temporary memory structure for future operations.

Compare operation 1006 compares at least a portion of the metadata. In one embodiment, the archival management system 310-1 (FIG. 3) accesses the temporary memory structure containing the metadata 916 (FIG. 9) and compares the date and/or time that the file was stored in the active archive 900 (FIG. 9) for the several files 902 (FIG. 9), 904 (FIG. 9), 906 (FIG. 9), 908 (FIG. 9), and 910 (FIG. 9). Determine operation 1008 determines which file has the earliest storage date and/or time. In embodiments, the archival management system 310-1 (FIG. 3) compares a first save date and/or time for a first file 902 (FIG. 9) to a second save date and/or time for a second file 904 (FIG. 9). The file with the older save date and/or time is kept and compared to a third file's 906 (FIG. 9) save date and/or time. This process of walking through the files continues until no further files are available for comparison. The remaining file is determined to have the oldest save date and/or time. As one skilled in the art will recognize, other methods for comparing the files are possible and contemplated.

Create operation 1010 creates a stub file for the file with the oldest save date and/or time. The archival management system 310-1 (FIG. 3), in embodiments, replaces the file data 918 (FIG. 9) with a pointer 920 (FIG. 9). The pointer 920 (FIG. 9) includes less data than the file data 918 (FIG. 9). In other embodiments, the archival management system 310-1 (FIG. 3) erases all the file data 918 (FIG. 9) without replacing the file data 918 (FIG. 9) with a pointer 920 (FIG. 9). In still other embodiments, the archival management system 310-1 (FIG. 3) erases the entire file 902 (FIG. 9) and records the action into the database 318-1 (FIG. 3) for later determination that the file has been stubbed. The archival management system 310-1 (FIG. 3) may also erase other parts of the file 902 (FIG. 9), such as the metadata 916 (FIG. 9).

Determine operation 1012 determines if another file needs to be stubbed. In embodiments, the archival management system 310-1 (FIG. 3) determines if the currently stored data 802 (FIG. 8) is below a LOM 816 (FIG. 8), as explained in conjunction with FIGS. 12-14. If there are more files that need to be stubbed, the method flows YES back to determine operation 1008. If there are no more files that need to be stubbed, the method flows NO to END operation 1014.
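
The loop formed by operations 1004 through 1012 can be sketched in a few lines. This is an illustration only: the dictionary keys, the function name, the skipping of permanent residency files, and the simplification that a stub occupies no storage are all assumptions. The key parameter names the metadata entry to compare, so a different metadata entry (for example, a last access time) could be substituted for the save time.

```python
def stub_until_below_lom(files, lom, key="saved"):
    """Stub the oldest eligible files until usage drops below the LOM.

    files: list of dicts with "id", "size", key, and optional "permanent".
    lom: low occupancy mark expressed in the same units as "size".
    Returns the ids of the stubbed files, in the order they were stubbed.
    """
    used = sum(f["size"] for f in files)
    stubbed = []
    while used >= lom:  # is another file needed?
        candidates = [f for f in files if not f.get("permanent") and f["size"] > 0]
        if not candidates:
            break  # nothing left that may be stubbed
        # Walk through the files pairwise, keeping the older one each time.
        oldest = candidates[0]
        for f in candidates[1:]:
            if f[key] < oldest[key]:
                oldest = f
        # Create the stub (assumed here to occupy no storage).
        used -= oldest["size"]
        oldest["size"] = 0
        stubbed.append(oldest["id"])
    return stubbed
```

With five files sized 30, 30, 10, 10 and 30 units, the first marked permanent, and a LOM of 60 units, the sketch stubs the two oldest non-permanent large files and then stops, leaving 50 units in use.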

Another embodiment of a method 1100 for creating stub files is shown in FIG. 11. In embodiments, the method 1100 generally begins with a START operation 1102 and terminates with an END operation 1114. The steps shown in the method 1100 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 11, the steps shown or described can, in some circumstances, be executed in a different order than presented herein.

Read operation 1104, as with read operation 1004 (FIG. 10), reads the metadata for one or more files. In embodiments, the archival management system 310-1 (FIG. 3) determines that files need to be stubbed. The archival management system 310-1 (FIG. 3) can read the metadata 916 (FIG. 9) for one or more files 902 (FIG. 9), 904 (FIG. 9), 906 (FIG. 9), 908 (FIG. 9), and 910 (FIG. 9). In embodiments, the archival management system 310-1 (FIG. 3) stores the metadata 916 (FIG. 9) or portions of the metadata 916 (FIG. 9) into a temporary memory structure for future operations.

Compare operation 1106 compares at least a portion of the metadata. In one embodiment, the archival management system 310-1 (FIG. 3) accesses the temporary memory structure containing the metadata 916 (FIG. 9) and compares the date and/or time that the file was last accessed in the active archive 900 (FIG. 9) for the several files 902 (FIG. 9), 904 (FIG. 9), 906 (FIG. 9), 908 (FIG. 9), and 910 (FIG. 9). Determine operation 1108 determines the earliest file access time. In embodiments, the archival management system 310-1 (FIG. 3) compares a last access date and/or time for a first file 902 (FIG. 9) to a last access date and/or time for a second file 904 (FIG. 9). The file with the older last access date and/or time is kept and compared to a third file's 906 (FIG. 9) last access date and/or time. This process of walking through the files continues until no further files are available for comparison. The remaining file is determined to have the oldest last access date and/or time. As one skilled in the art will recognize, other methods for comparing the files are possible and contemplated.

Create operation 1110 creates a stub file for the file with the oldest last access date and/or time. The archival management system 310-1 (FIG. 3), in embodiments, replaces the file data 918 (FIG. 9) with a pointer 920 (FIG. 9). The pointer 920 (FIG. 9) includes less data than the file data 918 (FIG. 9). In other embodiments, the archival management system 310-1 (FIG. 3) erases all the file data 918 (FIG. 9) without replacing the file data 918 (FIG. 9) with a pointer 920 (FIG. 9). In still other embodiments, the archival management system 310-1 (FIG. 3) erases the entire file 902 (FIG. 9) and records the action into the database 318-1 (FIG. 3) for later determination that the file has been stubbed. The archival management system 310-1 (FIG. 3) may also erase other parts of the file 902 (FIG. 9), such as the metadata 916 (FIG. 9).

Determine operation 1112 determines if another file needs to be stubbed. In embodiments, the archival management system 310-1 (FIG. 3) determines if the currently stored data 802 (FIG. 8) is below a LOM 816 (FIG. 8), as explained in conjunction with FIGS. 12-14. If there are more files that need to be stubbed, the method flows YES back to determine operation 1108. If there are no more files that need to be stubbed, the method flows NO to terminate 1114.
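The selection step of method 1100 differs from method 1000 only in the metadata field compared. The following Python sketch, offered as an illustration under stated assumptions rather than as the disclosed implementation, shows the pairwise walk over last-access times next to one of the "other methods" of comparison (a single `min` call). The function names and the dictionary of access times are hypothetical.

```python
from typing import Dict, Optional


def oldest_access_pairwise(last_access: Dict[str, float]) -> Optional[str]:
    """Keep the file with the older last-access time after each comparison."""
    keep = None
    for name, accessed_at in last_access.items():
        if keep is None or accessed_at < last_access[keep]:
            keep = name
    return keep


def oldest_access_min(last_access: Dict[str, float]) -> Optional[str]:
    """Alternative comparison: let min() pick the smallest access time."""
    if not last_access:
        return None
    return min(last_access, key=last_access.get)
```

Both functions return the same file for the same metadata, so the choice between them is a matter of style rather than behavior.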

An embodiment of a method 1200 for determining if one or more stub files should be created is shown in FIG. 12. In embodiments, the method 1200 generally begins with a START operation 1202 and terminates with an END operation 1212. The steps shown in the method 1200 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 12, the steps shown or described can, in some circumstances, be executed in a different order than presented herein.

Store operation 1204 stores archival data. In embodiments, the archival management system 310-1 (FIG. 3) receives archival data from an application server 306 (FIG. 3) and stores the archival data in the active archive 314-1 (FIG. 3).

Determine operation 1206 determines if the active archive HOM has been crossed. The archival management system 310-1 (FIG. 3), in embodiments, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the active archive HOM 814-1 (FIG. 8A) or 708 (FIG. 7A). In embodiments, the archival management system 310-1 (FIG. 3) makes the comparison after every store operation 1204. In alternative embodiments, the archival management system 310-1 (FIG. 3) makes the comparison periodically, for example, every day, every week, etc. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the active archive HOM 814-1 (FIG. 8A), the method flows YES to create operation 1208. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the active archive HOM 814-1 (FIG. 8A), the method flows NO back to store operation 1204 to store more archival data in the active archive.

Create operation 1208 creates at least one stub file. In embodiments, the archival management system 310-1 (FIG. 3) selects a file in the active archive from which to eliminate one or more portions. The creation of the stub file may be as explained in conjunction with FIGS. 10-11.

Determine operation 1210 determines if the active archive LOM has been crossed. In embodiments, the archival management system 310-1 (FIG. 3), after creating a stub file, compares the percentage of storage used by the currently stored data 802-2 (FIG. 8B) with the active archive LOM 816-2 (FIG. 8B). The archival management system 310-1 (FIG. 3) may make the comparison after making each stub file or, in other embodiments, may make the comparison after creating two or more stub files. If the percentage of storage used by the currently stored data 802-2 (FIG. 8B) is more than the percentage set for the active archive LOM 816-2 (FIG. 8B), the method flows NO back to create operation 1208. If the percentage of storage used by the currently stored data 802-2 (FIG. 8B) is not more than the percentage set for the active archive LOM 816-2 (FIG. 8B), the method flows YES to terminate 1212.
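The HOM/LOM flow of method 1200 can be sketched in Python as follows. This is an illustrative sketch only: the `ActiveArchive` class and its attributes are hypothetical, the sketch assumes the HOM is not lower than the LOM, it makes the comparison after every store, and it models stubbing as erasing all of a file's data, oldest-stored file first.

```python
from collections import OrderedDict
from typing import List


class ActiveArchive:
    """Hypothetical model of an active archive with HOM and LOM percentages."""

    def __init__(self, capacity: int, hom_pct: float, lom_pct: float):
        self.capacity = capacity
        self.hom_pct = hom_pct
        self.lom_pct = lom_pct
        self.files = OrderedDict()  # name -> data held, oldest stored first

    def percent_used(self) -> float:
        return 100.0 * sum(self.files.values()) / self.capacity

    def store(self, name: str, size: int) -> List[str]:
        """Store a file, then stub files if the HOM has been crossed."""
        self.files[name] = size                      # store operation 1204
        stubbed = []
        if self.percent_used() > self.hom_pct:       # determine operation 1206
            for victim in list(self.files):
                if self.percent_used() <= self.lom_pct:  # determine operation 1210
                    break
                if self.files[victim] == 0:
                    continue                         # already stubbed
                self.files[victim] = 0               # create operation 1208
                stubbed.append(victim)
        return stubbed
```

For example, with a capacity of 100, an HOM of 80 percent, and an LOM of 50 percent, storing three files of size 30 leaves the first two stores untouched; the third store pushes usage to 90 percent, so the two oldest files are stubbed and usage falls to 30 percent.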

Another embodiment of a method 1300 for determining if one or more stub files should be created is shown in FIG. 13. In embodiments, the method 1300 generally begins with a START operation 1302 and terminates with an END operation 1312. The steps shown in the method 1300 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 13, the steps shown or described can, in some circumstances, be executed in a different order than presented herein.

Store operation 1304 stores archival data. In embodiments, the archival management system 310-1 (FIG. 3) receives archival data from an application server 306 (FIG. 3) and stores the archival data in an application layer partition 712-1 (FIG. 7B).

Determine operation 1306 determines if the application layer partition HOM has been crossed. The archival management system 310-1 (FIG. 3), in embodiments, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the application layer partition HOM 814-1 (FIG. 8A) or 728-1 (FIG. 7C). In embodiments, the archival management system 310-1 (FIG. 3) makes the comparison after every store operation 1304 into the application layer partition. In alternative embodiments, the archival management system 310-1 (FIG. 3) makes the comparison periodically, for example, every day, every week, etc. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the application layer partition HOM 814-1 (FIG. 8A), the method flows YES to create operation 1308. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the application layer partition HOM 814-1 (FIG. 8A), the method flows NO back to store operation 1304 to store more archival data in the application layer partition of the active archive.

Create operation 1308 creates a stub file. In embodiments, the archival management system 310-1 (FIG. 3) selects at least one file in the application layer partition from which to eliminate one or more portions. The creation of the stub file may be as explained in conjunction with FIGS. 10-11.

Determine operation 1310 determines if the application layer partition LOM has been crossed. In embodiments, the archival management system 310-1 (FIG. 3), after creating a stub file, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the application layer partition LOM 816-1 (FIG. 8A) or 730-1 (FIG. 7C). The archival management system 310-1 (FIG. 3) may make the comparison after making each stub file in the application layer partition or, in other embodiments, may make the comparison after creating two or more stub files. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the application layer partition LOM 816-1 (FIG. 8A), the method flows NO back to create operation 1308. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the application layer partition LOM 816-1 (FIG. 8A), the method flows YES to terminate 1312.

Still another embodiment of a method 1400 for determining if one or more stub files should be created is shown in FIG. 14. In embodiments, the method 1400 generally begins with a START operation 1402 and terminates with an END operation 1416. The steps shown in the method 1400 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 14, the steps shown or described can, in some circumstances, be executed in a different order than presented herein.

Store operation 1404 stores archival data into an application layer partition. In embodiments, the archival management system 310-1 (FIG. 3) receives archival data from an application server 306 (FIG. 3) and stores the archival data in the application layer partition 712-1 (FIG. 7B).

Determine operation 1406 determines if the application layer partition HOM has been crossed. The archival management system 310-1 (FIG. 3), in embodiments, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the application layer partition HOM 814-1 (FIG. 8A) or 728-1 (FIG. 7C). In embodiments, the archival management system 310-1 (FIG. 3) makes the comparison after every store operation 1404 into the application layer partition. In alternative embodiments, the archival management system 310-1 (FIG. 3) makes the comparison periodically, for example, every day, every week, etc. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the application layer partition HOM 814-1 (FIG. 8A), the method flows YES to determine operation 1408. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the application layer partition HOM 814-1 (FIG. 8A), the method flows NO back to store operation 1404 to store more archival data in the active archive.

Determine operation 1408 determines if the active archive HOM has been crossed. The archival management system 310-1 (FIG. 3), in embodiments, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the active archive HOM 814-1 (FIG. 8A) or 708 (FIG. 7A). In embodiments, the archival management system 310-1 (FIG. 3) makes the comparison after every store operation 1404. In alternative embodiments, the archival management system 310-1 (FIG. 3) makes the comparison periodically, for example, every day, every week, etc. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the active archive HOM 814-1 (FIG. 8A), the method flows YES to determine operation 1410. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the active archive HOM 814-1 (FIG. 8A), the method flows NO back to store operation 1404 to store more archival data in the active archive.

Determine operation 1410 determines which application layer partitions have crossed their application layer partition HOM. The archival management system 310-1 (FIG. 3), in embodiments, again compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the application layer partition HOM 814-1 (FIG. 8A) or 728-1 (FIG. 7C). In other embodiments, if the archival management system 310-1 (FIG. 3), in determine operation 1406, determines that the application layer partition 712-1 (FIG. 7B) is over the application layer partition HOM 728-1 (FIG. 7C), the archival management system 310-1 (FIG. 3) records an indicator (not shown) identifying the application layer partition 712-1 (FIG. 7B) as over the application layer partition HOM 728-1 (FIG. 7C). The archival management system 310-1 (FIG. 3) can then retrieve this information during determine operation 1410. The record may be an identifier for the application layer partition 712-1 (FIG. 7B) and/or a flag representing that the application layer partition 712-1 (FIG. 7B) is over the application layer partition HOM 728-1 (FIG. 7C).

Create operation 1412 creates a stub file in at least one of the application layer partitions that is over the application layer partition HOM. In embodiments, the archival management system 310-1 (FIG. 3) selects a file in one of the application layer partitions from which to eliminate one or more portions. The creation of the stub file may be as explained in conjunction with FIGS. 10-11.

Determine operation 1414 determines if the active archive LOM has been crossed. In embodiments, the archival management system 310-1 (FIG. 3), after creating a stub file, compares the percentage of storage used by the currently stored data 802-1 (FIG. 8A) with the active archive LOM 816-1 (FIG. 8A). The archival management system 310-1 (FIG. 3) may make the comparison after making each stub file or, in other embodiments, may make the comparison after creating two or more stub files. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is more than the percentage set for the active archive LOM 816-1 (FIG. 8A), the method flows NO back to create operation 1412. If the percentage of storage used by the currently stored data 802-1 (FIG. 8A) is not more than the percentage set for the active archive LOM 816-1 (FIG. 8A), the method flows YES to terminate 1416.
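The two-level check of method 1400 can be sketched in Python as shown below. The sketch is illustrative and rests on several assumptions that are not taken from the disclosure: the `Partition` class, the `store` function signature, the choice to stub oldest-stored files first, and the modeling of a stub as a file whose data size is set to zero are all hypothetical. It also assumes the active archive HOM is not lower than the active archive LOM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Partition:
    """Hypothetical application layer partition with its own HOM."""
    capacity: int
    hom_pct: float
    files: List[list] = field(default_factory=list)  # [name, size], oldest first

    def used(self) -> int:
        return sum(size for _, size in self.files)

    def over_hom(self) -> bool:
        return 100.0 * self.used() / self.capacity > self.hom_pct


def store(partitions: Dict[str, Partition], archive_capacity: int,
          archive_hom: float, archive_lom: float,
          target: str, name: str, size: int) -> List[Tuple[str, str]]:
    """Store into one partition; stub only if partition AND archive HOM are crossed."""
    partitions[target].files.append([name, size])        # store operation 1404
    if not partitions[target].over_hom():                # determine operation 1406
        return []

    def archive_pct() -> float:
        total = sum(p.used() for p in partitions.values())
        return 100.0 * total / archive_capacity

    if archive_pct() <= archive_hom:                     # determine operation 1408
        return []
    over = [k for k, p in partitions.items() if p.over_hom()]  # determine operation 1410
    stubbed = []
    for key in over:
        for entry in partitions[key].files:
            if archive_pct() <= archive_lom:             # determine operation 1414
                return stubbed
            if entry[1] > 0:
                entry[1] = 0                             # create operation 1412
                stubbed.append((key, entry[0]))
    return stubbed
```

In this sketch a partition that exceeds its own HOM is left alone as long as the active archive as a whole remains at or below the active archive HOM, which reflects the ordering of determine operations 1406 and 1408.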

An embodiment of a method 1500 for adjusting the application layer partition HOM and LOM is shown in FIG. 15. In embodiments, the method 1500 generally begins with a START operation 1502 and terminates with an END operation 1508. The steps shown in the method 1500 may be executed in a computer system as a set of computer executable instructions. While a logical order is shown in FIG. 15, the steps shown or described can, in some circumstances, be executed in a different order than presented herein. In embodiments, the method 1500 is a further embodiment of determine operation 1306 (FIG. 13) or determine operation 1406 (FIG. 14).

Determine operation 1504 determines the retrieval rate for one or more application layer partitions. The archival management system 310-1 (FIG. 3), in embodiments, reads the metadata 916 (FIG. 9) for files in each application layer partition to determine the frequency and timing of accesses to the archived data file 902 (FIG. 9). In other embodiments, the archival management system 310-1 (FIG. 3) accesses data about each application layer partition from the database 318-1 (FIG. 3) regarding the frequency and timing of accesses to the application layer partitions. The archival management system 310-1 (FIG. 3), from this information, determines which application layer partitions have a higher rate of retrieval, that is, the archived data in the application layer partition is accessed more frequently and/or more recently.

Adjust operation 1506 adjusts one or more of the application layer partition capacity, HOM and/or LOM for one or more application layer partitions. In embodiments, the archival management system 310-1 (FIG. 3) changes the application layer partition capacity for the application layer partitions that are accessed more frequently. For example, if application layer partition A:\ 712-2 (FIG. 7) is accessed more frequently and the active archive has free space 732 (FIG. 7), the size of application layer partition A:\ 712-2 (FIG. 7) is increased to use some or all of the free space 732 (FIG. 7). In another embodiment, the archival management system 310-1 (FIG. 3) increases the size of application layer partition A:\ 712-2 (FIG. 7), which may be accessed more, and decreases the size of another application layer partition that is less often accessed.

The archival management system 310-1 (FIG. 3) may also increase the size of application layer partition A:\ 712-2 (FIG. 7), which may be accessed more, and decrease the size of another application layer partition that seldom or never crosses its application layer partition HOM.

The archival management system 310-1 (FIG. 3) may also increase the HOM for the application layer partitions that are accessed more often, allowing the more-often used application layer partitions to keep more archived data. In alternative embodiments, the archival management system 310-1 (FIG. 3) decreases the LOM for application layer partitions that are accessed less often such that the application layer partitions stub more files than other application layer partitions. One or more of the changes may be applied by the archival management system 310-1 (FIG. 3). The changes allow for dynamic adjustment of the application layer partitions to compensate for different usage patterns with the application servers.
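One possible reading of adjust operation 1506 is sketched below in Python. The policy shown is an assumption chosen for illustration, not a policy stated in the disclosure: partitions accessed more than the mean have their HOM raised by a fixed step, partitions accessed less than the mean have their LOM lowered by the same step, and any free space is granted to the most-accessed partition. The dictionary layout, the `step` parameter, and the function name are hypothetical.

```python
from typing import Dict


def adjust_partitions(parts: Dict[str, dict], free_space: int = 0,
                      step: float = 5.0) -> Dict[str, dict]:
    """Return adjusted copies of the partition settings; inputs are not modified.

    Each partition is a dict with keys "capacity", "hom", "lom", "accesses".
    """
    mean = sum(p["accesses"] for p in parts.values()) / len(parts)
    busiest = max(parts, key=lambda k: parts[k]["accesses"])
    adjusted = {}
    for name, p in parts.items():
        q = dict(p)
        if p["accesses"] > mean:
            # Frequently retrieved: raise the HOM so more data is kept.
            q["hom"] = min(100.0, p["hom"] + step)
        elif p["accesses"] < mean:
            # Rarely retrieved: lower the LOM so more files are stubbed.
            q["lom"] = max(0.0, p["lom"] - step)
        if name == busiest:
            # Grant the archive's free space to the most-accessed partition.
            q["capacity"] = p["capacity"] + free_space
        adjusted[name] = q
    return adjusted
```

Returning new settings rather than mutating the inputs makes it simple to compare the marks before and after an adjustment pass.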

In light of the above description, a number of advantages of the embodiments are readily apparent. A single archiving system can be organized into two or more independent file systems that service two or more application servers. As such, there is no need for a separate archiving system for each application server. The flexibility offered by the embodiments helps reduce the amount of equipment needed. Further, the granularity of management for the archive is greatly enhanced because each partition may have a unique and customized set of controls. In addition, the active archive can be managed to ensure that the active archive eliminates data to ensure availability for future storage. More and other advantages will be apparent to one skilled in the art.

A number of variations and modifications of the embodiments can also be used. In alternative embodiments, the application layer partitions in the active archive also include folders, with each folder having a set of customized controls. Further, active archive data may be replaced by links, such as by object linking and embedding (OLE), to the archived data in the RDA. As such, when an application desires the data in the active archive, the request is automatically redirected to the RDA.

While the principles of the disclosure have been described above in connection with specific apparatuses and methods, it is to be clearly understood that this description is made only by way of example and not as limitation on the scope of the disclosure.

1-10. (canceled)
 11. A method, executable in a computer system, for ensuring available storage in an active archive, the method comprising: storing archival data into the active archive; determining if currently stored data crosses a high occupancy mark (HOM); if the currently stored data crosses the HOM, creating a stub file in the active archive; and if the currently stored data does not cross the HOM, storing more archival data into the active archive.
 12. The method as defined in claim 11, further comprising: in response to creating the stub file, determining if the currently stored data crosses a low occupancy mark (LOM); if the currently stored data crosses the LOM, stopping file stubbing; and if the currently stored data does not cross the LOM, creating a second stub file in the active archive.
 13. The method as defined in claim 11, wherein the HOM is an active archive HOM.
 14. The method as defined in claim 11, wherein the determining if currently stored data crosses a HOM comprises determining if currently stored data in an application layer partition crosses an application layer partition HOM.
 15. The method as defined in claim 14, wherein creating a stub file in the active archive comprises: if the currently stored data in the application layer partition crosses the application layer partition HOM, creating a stub file in the application layer partition; and if the currently stored data in the application layer partition does not cross the application layer partition HOM, storing more archival data in the application layer partition.
 16. The method as defined in claim 15, further comprising: in response to creating the stub file, determining if the currently stored data crosses an application layer partition low occupancy mark (LOM); if the currently stored data crosses the application layer partition LOM, stopping file stubbing; and if the currently stored data does not cross the application layer partition LOM, creating a second stub file in the application layer partition.
 17. The method as defined in claim 14, wherein determining if currently stored data in an application layer partition crosses an application layer partition HOM comprises: determining a retrieval rate for each application layer partition; and adjusting, for at least one application layer partition, one of a group consisting of an application layer partition capacity, an application layer partition HOM and an application layer partition low occupancy mark (LOM).
 18. The method as defined in claim 11, wherein the determining if currently stored data crosses a HOM comprises: determining if currently stored data in an application layer partition crosses an application layer partition HOM; if the currently stored data in the application layer partition does not cross the application layer partition HOM, storing more archival data; if the currently stored data in the application layer partition crosses the application layer partition HOM, determining if currently stored data in the active archive crosses an active archive HOM; if currently stored data in the active archive does not cross the active archive HOM, storing more archival data; if currently stored data in the active archive crosses the active archive HOM, determining which application layer partition has crossed the application layer partition HOM; and creating the stub file in the application layer partition that has crossed the application layer partition HOM.
 19. The method as defined in claim 18, in response to creating the stub file, determining if the currently stored data crosses an active archive low occupancy mark (LOM); if the currently stored data crosses the active archive LOM, stopping file stubbing; and if the currently stored data does not cross the active archive LOM, creating a second stub file in the application layer partition.
 20. The method as defined in claim 11, wherein creating the stub file comprises: reading metadata for one or more files in the active archive; comparing a date, from the metadata, for when the one or more files were stored in the active archive; determining an oldest date for when one of the files was stored in the active archive; and creating the stub file for the file with the oldest date for when the file was stored in the active archive.
 21. The method as defined in claim 11, wherein creating the stub file comprises: reading metadata for one or more files in the active archive; comparing a date, from the metadata, for when the one or more files was accessed in the active archive; determining an oldest date for when one of the files was accessed in the active archive; and creating the stub file for the file with the oldest date for when the file was accessed in the active archive.
 22. A method, executable in a computer system, for ensuring available storage in an active archive, the method comprising: storing archival data into the active archive, the active archive including: an active archive high occupancy mark (HOM), the active archive HOM being a first percentage of a total storage capacity of the active archive; and an active archive low occupancy mark (LOM), the active archive LOM being a second percentage of the total storage capacity of the active archive; determining if currently stored data crosses the HOM; if the currently stored data crosses the HOM, creating a stub file in the active archive, wherein one or more files in the active archive are stubbed, and wherein stubbing of files stops when the current data stored crosses over the active archive LOM; and if the currently stored data does not cross the HOM, storing more archival data into the active archive.
 23. The method of claim 22, wherein the active archive HOM is crossed over when a percentage of the storage capacity used by the currently stored data is more than the first percentage of the active archive HOM.
 24. The method of claim 23, wherein the active archive LOM is crossed over when a percentage of the storage capacity used by the currently stored data is not more than the second percentage of the active archive LOM.
 25. The method of claim 21, wherein creating a stub file comprises eliminating at least a portion of data from the one or more files that are stubbed in the active archive, and replacing the data that is eliminated with the stub file.
 26. The method of claim 25, further comprising storing a copy of the data that is replaced with the stub file in the active archive, wherein the copy of the data is stored on a removable drive array.
 27. The method of claim 26, further comprising partitioning the removable drive array to include one or more application layer partitions, and partitioning the active archive to mirror the application layer partitions of the removable drive array.
 28. The method of claim 27, wherein each application layer partition includes: an application layer partition HOM, the application layer partition HOM being a first percentage of total storage capacity of the application layer partition, wherein the one or more files in the active archive are stubbed if current data stored in the application layer partition crosses over the application layer partition HOM; and an application layer partition LOM, the application layer partition LOM being a second percentage of the total storage capacity of the application layer partition, wherein the stubbing of files stops when the current data stored in the application layer partition crosses over the application layer partition LOM.
 29. The method of claim 27, further comprising: receiving the archival data from one or more application servers; determining on which application layer partition to store the archival data; and applying one or more controls to the archival data, wherein: each application layer partition has a separate set of controls for customized storage of different types of the archival data in different application layer partitions with different controls; and all archival data stored in any one of the application layer partitions has the same controls.
 30. The method of claim 29, wherein each of the application servers accesses only one of the application layer partitions, and cannot send data to others of the application layer partitions.